Binary Neural Networks Algorithms, Architectures, and Applications (Baochang Zhang, Sheng Xu, Mingbao Lin etc.)

CP-NAS: Child-Parent Neural Architecture Search for 1-bit CNNs


FIGURE 4.5

The main framework of the Child-Parent model. The Child-Parent model focuses on binarized architecture search (left) and binarized optimization (right). [Figure labels: CP NAS (search space, performance evaluation of Parent and Child, reduce); CP Model (Parent, Child); CP Optimization (Parent weights, Child weights, binarized, MSE loss).]

Thus, we can define it for each operation of the sampled network as

$$z_{k,t}^{(i,j)} = \beta_P \left(A_{P,t} - A_{C,t}\right) + A_{C,t} \tag{4.15}$$

where $A_{P,t}$ and $A_{C,t}$ represent the network performance, measured as the accuracy on the validation dataset of the full-precision model (Parent) and the binarized model (Child), respectively, and $\beta_P$ is a hyperparameter that controls the performance loss. The indices $i, j$ identify the nodes that form the edge $(i, j)$ shown in Fig. 4.6, $k$ is the index of the operation on that edge, and $t$ denotes the $t$-th sampling process. Note that we use the performance of the sampled network to evaluate the performance of the corresponding selected operations.
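As a minimal sketch of Eq. (4.15), the snippet below computes the Child-Parent performance value for one sampled network and assigns it to every operation selected in that sample, as the note above describes. The function names, the dictionary layout for the sampled operations, and the numeric values of the accuracies and of $\beta_P$ are illustrative assumptions, not values taken from CP-NAS.

```python
from typing import Dict, Tuple

Edge = Tuple[int, int]  # (i, j): edge from node i to node j


def cp_performance(acc_parent: float, acc_child: float, beta_p: float) -> float:
    """Eq. (4.15): z = beta_P * (A_P - A_C) + A_C."""
    return beta_p * (acc_parent - acc_child) + acc_child


def score_sampled_ops(
    sampled_ops: Dict[Edge, int],
    acc_parent: float,
    acc_child: float,
    beta_p: float,
) -> Dict[Tuple[int, int, int], float]:
    """Give every selected operation (i, j, k) the score of the sampled network.

    `sampled_ops` maps an edge (i, j) to the chosen operation index k
    (a hypothetical layout used only for this illustration).
    """
    z = cp_performance(acc_parent, acc_child, beta_p)
    return {(i, j, k): z for (i, j), k in sampled_ops.items()}


# Hypothetical sample: three edges, each with one chosen operation index.
sampled = {(0, 2): 3, (1, 2): 0, (0, 3): 5}
scores = score_sampled_ops(sampled, acc_parent=0.75, acc_child=0.5, beta_p=0.5)
```

With $\beta_P = 0$ the value reduces to the Child accuracy alone, and with $\beta_P = 1$ it equals the Parent accuracy, so the hyperparameter interpolates between the two.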

CP-NAS [304] not only uses the accuracy on the validation dataset to guide the search process directly but also draws on the information of the full-precision model to better investigate the full potential that the binarized model can ultimately reach. Additional details are provided in the following section.

As shown in Fig. 4.5, unlike the traditional teacher-student model [87], which transfers the generalization ability of the first model to a smaller model by using the class probabilities as “soft targets,” the Child-Parent model focuses on a performance measure that is particularly suitable for NAS-based network binarization. Furthermore, the loss function for the teacher-student model is constrained to the feature map or the output, while ours focuses on the kernel weights.
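To make the contrast concrete, the sketch below computes a mean-squared error directly on kernel weights, between full-precision Parent weights and binarized Child weights, rather than on outputs or feature maps. It is only an illustration of a weight-level MSE as labeled in Fig. 4.5: the sign-based binarization, the flat-list weight layout, and the function names are assumptions, and the exact loss used by CP-NAS is not specified in this passage.

```python
from typing import List


def binarize(weights: List[float]) -> List[float]:
    """Sign binarization to {-1, +1}; mapping 0 to +1 is an assumption."""
    return [1.0 if w >= 0 else -1.0 for w in weights]


def kernel_mse(parent_weights: List[float], child_weights: List[float]) -> float:
    """Mean-squared error between two flattened kernels of equal size."""
    if len(parent_weights) != len(child_weights):
        raise ValueError("kernels must have the same number of weights")
    squared = [(p - c) ** 2 for p, c in zip(parent_weights, child_weights)]
    return sum(squared) / len(squared)


# Hypothetical flattened kernel of a full-precision Parent layer.
parent = [0.5, -0.5, 1.5, -1.0]
# For illustration only, the Child kernel is the binarized Parent kernel.
child = binarize(parent)
loss = kernel_mse(parent, child)
```

A teacher-student loss would instead compare class probabilities or feature maps produced by the two networks, so it never touches the weights themselves.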

FIGURE 4.6

The cell architecture for CP-NAS. A cell includes 2 input nodes, 4 intermediate nodes, and 14 edges.
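The edge count in the caption can be checked with a short enumeration. The sketch below assumes that every intermediate node receives an edge from each node that precedes it (both input nodes and earlier intermediate nodes); this connectivity is an assumption chosen because it reproduces the stated 14 edges (2 + 3 + 4 + 5), and the node numbering is hypothetical.

```python
from typing import List, Tuple


def cell_edges(num_inputs: int = 2, num_intermediate: int = 4) -> List[Tuple[int, int]]:
    """List edges (i, j) where intermediate node j is fed by every earlier node i.

    Nodes 0..num_inputs-1 are input nodes; the rest are intermediate nodes.
    """
    edges = []
    for j in range(num_inputs, num_inputs + num_intermediate):
        for i in range(j):  # all nodes that come before node j
            edges.append((i, j))
    return edges


edges = cell_edges()
num_edges = len(edges)  # 2 + 3 + 4 + 5 under the assumed connectivity
```

Each pair $(i, j)$ produced here corresponds to one edge index used in Eq. (4.15), with the operation index $k$ chosen separately for each edge.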